11 research outputs found

    Machine learning in prediction of intrinsic aqueous solubility of drug‐like compounds: Generalization, complexity, or predictive ability?

    We present a collection of publicly available intrinsic aqueous solubility data of 829 drug‐like compounds. Four different machine learning algorithms (random forests [RF], LightGBM, partial least squares, and least absolute shrinkage and selection operator [LASSO]) coupled with multistage permutation importance for feature selection and Bayesian hyperparameter optimization were used for the prediction of solubility based on chemical structural information. Our results show that LASSO yielded the best predictive ability on an external test set, with a root mean square error (RMSE(test)) of 0.70 log points, an R²(test) of 0.80, and 105 features. Taking into account the number of descriptors as well, an RF model achieves the best balance between complexity and predictive ability, with an RMSE(test) of 0.72 log points, an R²(test) of 0.78, and only 17 features. On a more aggressive test set (principal component analysis [PCA]‐based split), better generalization was observed for the RF model. We propose a ranking score for choosing the best model, as test set performance is only one of the factors in creating an applicable model. The ranking score is a weighted combination of generalization, number of features, and test performance. Out of the two best learners, a consensus model was built exhibiting the best predictive ability and generalization, with an RMSE(test) of 0.67 log points and an R²(test) of 0.81.
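The abstract describes the ranking score only as a weighted combination of generalization, number of features, and test performance; it gives neither the formula nor the weights. The following is a minimal sketch of one way such a score could be built, not the authors' definition. The equal weights, the use of the gap between cross-validated and test RMSE as the generalization term, and the `rmse_cv` values are all assumptions made for illustration; only the test RMSE values and feature counts come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class ModelSummary:
    name: str
    rmse_test: float   # RMSE on the external test set (from the abstract)
    rmse_cv: float     # HYPOTHETICAL cross-validated RMSE (not in the abstract)
    n_features: int    # number of descriptors in the model (from the abstract)


def ranking_score(m, max_features, w_gen=1 / 3, w_feat=1 / 3, w_test=1 / 3):
    """Lower is better. The weights and the three terms are assumptions."""
    generalization_gap = abs(m.rmse_test - m.rmse_cv)  # assumed generalization proxy
    feature_fraction = m.n_features / max_features     # complexity scaled to [0, 1]
    return w_gen * generalization_gap + w_feat * feature_fraction + w_test * m.rmse_test


models = [
    # rmse_cv is a placeholder set equal to rmse_test, so the gap term is zero here.
    ModelSummary("LASSO", rmse_test=0.70, rmse_cv=0.70, n_features=105),
    ModelSummary("RF", rmse_test=0.72, rmse_cv=0.72, n_features=17),
]

max_features = max(m.n_features for m in models)
ranked = sorted(models, key=lambda m: ranking_score(m, max_features))

for m in ranked:
    print(f"{m.name}: {ranking_score(m, max_features):.3f}")
```

With these placeholder inputs the feature-count term dominates, so the sparser model ranks first; different weights or a different generalization measure could change the order.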

    Quantification of coronary atherosclerotic burden with coronary computed tomography angiography: adapted Leaman score in Croatian patients

    The aim of the study was to quantify the total coronary atherosclerotic burden in patients with suspected coronary artery disease (CAD), defined by the coronary computed tomography adapted Leaman score (CT-LeSc), and to estimate its cut-off level for high coronary atherosclerotic burden. We enrolled 434 consecutive patients referred to coronary computed tomography angiography, of whom 261 fulfilled the study inclusion criteria. Demographic and clinical characteristics, as well as CAD risk factors, were obtained. CAD pre-test probabilities were estimated by the Diamond-Forrester model and the Morise score. The coronary atherosclerotic burden was estimated using CT-LeSc. As a cut-off for a high coronary atherosclerotic burden, we used the third tercile (Tc3) (CT-LeSc ≥ 5.52). We evaluated the association of clinical characteristics and risk factors with Tc3 in univariate and multivariate analysis. There were 60.9% males and 39.1% females; 81% of patients had above-normal weight, 68.2% hypertension, 54.0% dyslipidemia, 15.3% diabetes mellitus, 12.3% a positive smoking history, and 11.9% a family history of CAD. According to the Diamond-Forrester model and the Morise score, the majority of patients had intermediate risk (59.7% and 52.8%, respectively), followed by the high-risk group (36.0% and 34.4%, respectively). In the univariate analysis, age, dyslipidemia, hypertension, and pre-test risk scores significantly predicted Tc3. In the multivariate analysis, male sex (p = 0.004), dyslipidemia (p = 0.002), and coronary calcium score (p < 0.001) were identified as predictors of Tc3. CT-LeSc quantified the total coronary atherosclerotic burden and showed an association of risk factors and pre-test probabilities with Tc3.
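The tercile-based cut-off described above can be sketched in a few lines: the lower bound of the third tercile is the second of the two cut points that split the sorted scores into three groups. The scores below are invented for illustration and are not patient data; the abstract does not state which quantile method was used, so the default of Python's `statistics.quantiles` is an assumption.

```python
import statistics

# HYPOTHETICAL scores, invented for illustration only.
scores = [0.0, 1.2, 2.5, 3.1, 4.0, 5.0, 5.6, 7.3, 9.8]

# n=3 returns the two cut points separating the three terciles.
lower_cut, upper_cut = statistics.quantiles(scores, n=3)

# Scores at or above the upper cut point fall in the third tercile (Tc3).
cutoff = upper_cut
high_burden = [s for s in scores if s >= cutoff]

print(f"Tc3 cut-off: {cutoff:.2f}; scores in Tc3: {len(high_burden)}")
```

In the study, the same kind of threshold (CT-LeSc ≥ 5.52) defines the binary outcome that the univariate and multivariate analyses then try to predict.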

    Lipophilicity determination of antifungal isoxazolo[3,4-b]pyridin-3(1h)-ones and their n1-substituted derivatives with chromatographic and computational methods

    The lipophilicity of a molecule is well recognized as a crucial physicochemical factor that conditions the biological activity of a drug candidate. This study aimed to evaluate the lipophilicity of isoxazolo[3,4-b]pyridine-3(1H)-ones and their N1-substituted derivatives, which demonstrated pronounced antifungal activities. Several methods, including reversed-phase thin-layer chromatography (RP-TLC), reversed-phase high-performance liquid chromatography (RP-HPLC), and micellar electrokinetic chromatography (MEKC), were employed. Furthermore, calculated logP values were obtained using various freely and commercially available software packages and online platforms, as well as density functional theory (DFT) computations. Similarities and dissimilarities between the determined lipophilicity indices were assessed using several chemometric approaches. Principal component analysis (PCA) indicated that features other than lipophilicity affect the antifungal activities of the investigated derivatives. Quantitative structure-retention relationship (QSRR) analysis by means of genetic algorithm-partial least squares (GA-PLS) was implemented to rationalize the link between the physicochemical descriptors and lipophilicity. Among the studied compounds, structure 16 should be considered the best starting structure for further studies, since it demonstrated the lowest lipophilic character within the series while retaining biological activity. Sum of ranking differences (SRD) analysis indicated that the chromatographic approach, regardless of the technique employed, should be considered the best approach for lipophilicity assessment of isoxazolones.

    Non-Linear Quantitative Structure–Activity Relationships Modelling, Mechanistic Study and In-Silico Design of Flavonoids as Potent Antioxidants

    In this work, we developed quantitative structure–activity relationships (QSAR) models for prediction of the oxygen radical absorbance capacity (ORAC) of flavonoids. Both linear (partial least squares, PLS) and non-linear (artificial neural networks, ANNs) models were built using parameters of two well-established antioxidant activity mechanisms, namely, the hydrogen atom transfer (HAT) mechanism defined with the minimum bond dissociation enthalpy, and the sequential proton-loss electron transfer (SPLET) mechanism defined with proton affinity and electron transfer enthalpy. Due to pronounced solvent effects within the ORAC assay, the hydration energy was also considered. The four-parameter PLS-QSAR model yielded relatively high root mean square errors (RMSECV = 0.783, RMSEE = 0.668, RMSEP = 0.900). Conversely, the ANN-QSAR model yielded considerably lower errors (RMSEE = 0.180 ± 0.059, RMSEP1 = 0.164 ± 0.128, and RMSEP2 = 0.151 ± 0.114) due to the inherent non-linear relationships between the molecular structures of flavonoids and ORAC values. Five-fold cross-validation was found to be unsuitable for the internal validation of the ANN-QSAR model, with a high RMSECV of 0.999 ± 0.253, which is due to the limited sample size; in that setting, resampling with replacement is a considerably better alternative. Chemical domains of applicability were defined for both models, confirming their reliability and robustness. Based on the PLS coefficients and partial derivatives, both models were interpreted in terms of the HAT and SPLET mechanisms. Theoretical computations based on density functional theory at the ωB97XD/6-311++G(d,p) level of theory were also carried out to shed further light on the plausible mechanism of anti-peroxy radical activity. Calculated energetics for simplified models (genistein and quercetin) with the peroxyl radical derived from 2,2′-azobis(2-amidino-propane) dihydrochloride suggested that both SPLET and single electron transfer followed by proton loss (SETPL) mechanisms are competitive and more favorable than HAT in aqueous medium. The finding is in good accord with the ANN-based QSAR modelling results. Finally, the strongly predictive ANN-QSAR model was used to predict antioxidant activities for a series of 115 flavonoids designed combinatorially with flavone as a template. Structural trends were analyzed, and general guidelines for the synthesis of new flavonoid derivatives with potentially potent antioxidant activities were given.

    The analysis of waiting time and utilization of computed tomography and magnetic resonance imaging in Croatia: a nationwide survey

    Aim: To assess the variation in the waiting time for diagnostic imaging (DI) services among Croatian public hospitals and the utilization of computed tomography (CT) and magnetic resonance imaging (MRI) scanners. Methods: We analyzed aggregated data from public hospitals. Counties were classified according to economic strength, and utilization was expressed as the average number of exams per machine. We compared the waiting times for 2018 and utilization for 2015 according to hospital category (high and low level) and economic strength by county. Results: The waiting time was longer for MRI compared with CT, 268 vs 77.61 days. Overall CT waiting time was in the unfavorable European Health Consumer Index category. High-level hospitals had longer waiting times for MRI and CT. The waiting time positively correlated with economic strength for MRI (P=0.019), but not for CT. In low-level hospitals, MRI utilization ranged from 104 to 6,032, whereas CT utilization ranged from 48 to 17,852. In high-level hospitals, MRI utilization ranged from 3,846 to 11,026, while CT utilization ranged from 503 to 17,234. CT (P=0.041) and MRI (P=0.031) utilization in high-level hospitals was significantly higher than in low-level hospitals. Conclusion: The waiting times for CT and MRI were exceptionally long regardless of the hospital category, with highly varying utilization. Croatia performed more exams per scanner compared with other EU countries, but not significantly so. High-level hospitals' utilization was significantly higher than that of low-level hospitals, and CT utilization was significantly higher than the EU average, while the difference for MRI utilization was not significant.

    Geographical and Temporal Distribution of Radiologists, Computed Tomography and Magnetic Resonance Scanners in Croatia

    The aim of the study was to analyse the temporal and geographic distribution of radiologists, computed tomography (CT) and magnetic resonance (MR) scanners in Croatia. In this observational study, we estimated the number of radiologists per 100,000 population for 1997, 2006, and 2017 and compared private and public CT and MR scanners between 2011 and 2018. We analyzed the availability of radiologists and scanners, and the relationship between the radiological workforce and economic strength among counties. The workforce increased significantly from 1997 to 2017 and was associated with economic strength categories in 2017. In 2018, there were more CT scanners in the public sector, while MR scanners were distributed evenly. In 2011, there was a similar distribution of CT and MR scanners between sectors, while in 2018 there were significantly more public CT scanners. Counties with a medical school had significantly more radiologists and MR scanners. The high-to-low ratios for CT and MR were 11 and 8.2, suggesting inequality of health care. Croatia significantly increased its radiological workforce; however, cross-county inequality remained. Counties with higher economic strength and medical schools have better availability of radiologists and equipment. To ensure the sustainable activity of the health care system, a precise estimate of supply and demand of radiology services is needed.
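The two quantities this abstract relies on, a count per 100,000 population and a high-to-low ratio across counties, are simple to compute. The sketch below shows one plausible reading of them: it assumes the ratio is the highest county density divided by the lowest, which the abstract does not spell out. The county names, scanner counts, and populations are hypothetical.

```python
def per_100k(count, population):
    """Density of a resource per 100,000 inhabitants."""
    return count / population * 100_000


def high_to_low_ratio(densities):
    """Highest density divided by lowest non-zero density (assumed definition)."""
    values = [v for v in densities.values() if v > 0]
    return max(values) / min(values)


# HYPOTHETICAL counties: (number of CT scanners, population).
counties = {
    "County A": (12, 800_000),
    "County B": (1, 110_000),
    "County C": (3, 150_000),
}

ct_density = {name: per_100k(n, pop) for name, (n, pop) in counties.items()}
ratio = high_to_low_ratio(ct_density)

print(f"High-to-low ratio: {ratio:.1f}")
```

Counties with no scanners are excluded here to avoid division by zero; whether the study handled such counties this way is not stated.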

    Machine Learning in Prediction of Intrinsic Aqueous Solubility of Drug-like Compounds: Generalization, Complexity or Predictive Ability?

    Here, we present a collection of publicly available intrinsic aqueous solubility data of 829 drug-like compounds. Four different machine learning algorithms (random forest, LightGBM, partial least squares and LASSO) coupled with multi-stage permutation importance for feature selection and Bayesian hyperparameter optimization were employed for the prediction of solubility based on chemical structural information. Our results have shown that LASSO yielded the best predictive ability on an external test set with an RMSE(test) of 0.70 log points and 105 features in the model. Taking into account the number of descriptors as well, an RF model achieved the best balance between complexity and predictive ability with an RMSE(test) of 0.72 with only 17 features. We propose a ranking score for choosing the best model, as test set performance is only one of the factors in creating an applicable model. The ranking score is a weighted combination of generalization, number of features involved and test set performance. The data related to this paper can be downloaded from 10.5281/zenodo.3968754

    Target-based drug discovery through inversion of quantitative structure-drug-property relationships and molecular simulation: CA IX-sulphonamide complexes

    In this work, a target-based drug screening method is proposed that exploits the synergy of ligand-based and structure-based computer-assisted drug design. The new method offers great flexibility in drug design and efficiently yields drug candidates with considerably lower risk. As a model system, 45 sulphonamides (33 training, 12 testing ligands) in complex with carbonic anhydrase IX were used for the development of quantitative structure-activity-lipophilicity (property) relationships (QSPRs). For each ligand, nearly 5,000 molecular descriptors were calculated, while lipophilicity (logkw) and inhibitory activity (logKi) were used as drug properties. Genetic algorithm-partial least squares (GA-PLS) provided a QSPR model with high prediction capability employing only seven molecular descriptors. As a proof of concept, the optimal drug structure was obtained by inverting the model with respect to reference drug properties. In total, 3,509 ligands were ranked accordingly. The top 10 ligands were further validated through molecular docking. Large-scale molecular dynamics (MD) simulations were performed to test the stability of the structures of selected ligands obtained through docking, complemented with biophysical experiments.
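The abstract says ligands were ranked with respect to reference drug properties but does not say how. As a rough illustration only, the sketch below ranks candidates by the Euclidean distance between their predicted property pair (logkw, logKi) and a reference pair. The distance measure, the absence of property scaling, the ligand names, and every number are assumptions; this is not the inversion procedure from the paper.

```python
import math


def rank_by_reference(predicted, reference):
    """Sort ligand names by distance of predicted properties to the reference.

    predicted: dict mapping ligand name -> (logkw, logKi) predicted by a model
    reference: (logkw, logKi) of the desired reference drug profile
    """
    return sorted(predicted, key=lambda name: math.dist(predicted[name], reference))


# HYPOTHETICAL predictions and reference profile, for illustration only.
predicted = {
    "ligand_1": (2.1, 1.0),
    "ligand_2": (3.0, 0.2),
    "ligand_3": (2.4, 0.6),
}
reference = (2.5, 0.5)

ranking = rank_by_reference(predicted, reference)
print(ranking)
```

In practice the two properties would probably need to be put on a common scale before distances are compared, since an unscaled distance lets the property with the wider range dominate.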